INTRODUCTION AND SUMMARY

Asymptotic  approximations  of  optimal  control  laws  are 
determined  here  for  a  class  of  multivariate  dynamic  systems  in 
which  the  controller  has  only  noisy  measurements  of  system  state 
components  whose  second  time-derivatives,  but  not  first,  can  be 
directly  affected  by  plant  noise  or  the  control.  The  control 
optimization  problem  for  these  cases  would  have  the  standard 
linear-quadratic-Gaussian  form  except  for  certain  small 
nonlinearities  involving  slowly  varying  parameters,  which  are 
treated as components of an augmented state vector. Also, the
measurement  noise  is  small  in  a  certain  relative  sense,  which  gives 
this  control  problem  special  properties. 

A  special  case  of  this  problem,  which  arises  in  homing  missile 
guidance,  was  treated  in  Reference  1.  The  only  nonlinearity  in  that 
case  is  a  term  in  the  state  measurement  equation  that  is  bilinear  in 
the  parameter  and  control  (both  scalars)  and  gives  rise  to  a  rapidly 
varying  term  in  the  optimal  control  law.  This  rapidly  varying  term 
is  generated  as  the  output  of  a  critically  damped  second-order 
system  driven  by  a  Kalman  filter  innovation  variable. 

The  methods  used  in  Reference  1  depended  on  special  features 
of  the  case  treated  there;  however,  the  same  basic  approach  can  be 
applied  here  with  some  modification.  The  result  in  this  more 
general  case  is  that — to  the  level  of  accuracy  retained  in  the 
asymptotic  approximations — the  same  sort  of  rapidly  varying  term 
appears  in  the  optimal  control  law.  This  term  results  from  bilinear 
measurement  terms  in  the  control  and  parameter  variables,  but  not 
from other nonlinearities considered here. In general, this extra
control  term  is  a  linear  function  of  the  output  of  a  multivariate 
linear  system  driven  by  a  Kalman  filter  innovation  variable.  These 
results  are  applied  to  the  design  of  an  adaptive  pitch  autopilot  for  a 
missile  as  a  numerical  example. 


NOTATION 


Unless otherwise stated, lower case letters denote (real finite-dimensional) column vectors and scalars. Matrices are denoted by capital Roman letters. A^T denotes the transpose of a matrix A, and tr(A) its trace if A is square.


It will be convenient to make use of three-way matrices, which are always denoted by capital Greek letters here. For continuity of notation, the following definitions are adopted for such a three-way matrix \Gamma, with vector x and matrices A and B of compatible dimensions, and with repeated indices denoting summation:

    (\Gamma x)_{ij} = \Gamma_{ijk} x_k                                        (matrix)
    (A x^T)_{ijk} = A_{ij} x_k                                                (three-way matrix)
    (A \Gamma)_{ijk} = A_{i\alpha} \Gamma_{\alpha jk}                         (three-way matrix)
    (\Gamma B)_{ijk} = \Gamma_{ij\alpha} B_{\alpha k}                         (three-way matrix)
    (\Gamma')_{ijk} = \Gamma_{jki} and (\Gamma^T)_{ijk} = \Gamma_{kji}        (three-way matrices)
    [tr(\Gamma)]_i =                                                          (column vector, when applicable)


With  these  definitions,  the  expression  AFFBDxxT  is  fully  associative. 
Many  other  consequences  are  obvious.  Some  useful  but  less  obvious 
properties  are 
    tr(\Gamma x) = [tr(\Gamma)]^T x
    A tr(\Gamma') = tr(A \Gamma')
    tr(A \Gamma) = tr(\Gamma A)
    (\Gamma B)' = B^T \Gamma' and (A \Gamma)'' = \Gamma'' A^T
    (A \Gamma B)^T = B^T \Gamma^T A^T
    (\Gamma x) A^T = (A \Gamma)' x and (\Gamma' x) B = (\Gamma B)'' x.
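The index bookkeeping for three-way matrices is easy to get wrong, so a small numerical check can help. The following is a minimal sketch in plain Python, assuming the index conventions (\Gamma')_{ijk} = \Gamma_{jki}, (\Gamma^T)_{ijk} = \Gamma_{kji}, (A\Gamma)_{ijk} = A_{i\alpha}\Gamma_{\alpha jk}, and (\Gamma B)_{ijk} = \Gamma_{ij\alpha}B_{\alpha k}. The function names (prime, transpose3, left_mul, right_mul) are hypothetical helpers introduced only for this illustration. It checks three of the transposition-type identities on random integer arrays.

```python
import random

def shape3(g):
    """Return the three dimensions of a nested-list three-way array."""
    return len(g), len(g[0]), len(g[0][0])

def prime(g):
    """Cyclic index shift, assumed convention: (G')[i][j][k] = G[j][k][i]."""
    a, b, c = shape3(g)
    return [[[g[j][k][i] for k in range(b)] for j in range(a)] for i in range(c)]

def transpose3(g):
    """Index reversal, assumed convention: (G^T)[i][j][k] = G[k][j][i]."""
    a, b, c = shape3(g)
    return [[[g[k][j][i] for k in range(a)] for j in range(b)] for i in range(c)]

def transpose2(m):
    """Ordinary matrix transpose."""
    return [list(row) for row in zip(*m)]

def left_mul(m, g):
    """(M G)[i][j][k] = sum_s M[i][s] * G[s][j][k]."""
    a, b, c = shape3(g)
    return [[[sum(row[s] * g[s][j][k] for s in range(a)) for k in range(c)]
             for j in range(b)] for row in m]

def right_mul(g, m):
    """(G M)[i][j][k] = sum_s G[i][j][s] * M[s][k]."""
    a, b, c = shape3(g)
    q = len(m[0])
    return [[[sum(g[i][j][s] * m[s][k] for s in range(c)) for k in range(q)]
             for j in range(b)] for i in range(a)]

rng = random.Random(0)

def rand_mat(rows, cols):
    return [[rng.randint(-3, 3) for _ in range(cols)] for _ in range(rows)]

def rand_3way(a, b, c):
    return [rand_mat(b, c) for _ in range(a)]

G = rand_3way(2, 3, 4)   # three-way matrix
A = rand_mat(5, 2)       # multiplies G from the left
B = rand_mat(4, 3)       # multiplies G from the right

# (G B)' = B^T G'
id1 = prime(right_mul(G, B)) == left_mul(transpose2(B), prime(G))
# (A G B)^T = B^T G^T A^T
id2 = (transpose3(right_mul(left_mul(A, G), B))
       == right_mul(left_mul(transpose2(B), transpose3(G)), transpose2(A)))
# (A G)'' = G'' A^T
id3 = prime(prime(left_mul(A, G))) == right_mul(prime(prime(G)), transpose2(A))
print(id1, id2, id3)
```

Integer entries are used so that the comparisons are exact. The same helpers could be used to test any other candidate identity before relying on it.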

Partial derivatives of a scalar s with respect to a matrix A and vectors x and y are denoted by subscripts, with the convention that

    (s_{xy})_{ij} = \partial^2 s / (\partial x_i \partial y_j)

and

    (s_{Ax})_{ijk} = \partial^2 s / (\partial A_{ij} \partial x_k).


PROBLEM  AND  BASIC  APPROACH 
The problem treated here involves a system with dynamics

    \dot{x} = F x + v                                             (1)

    \dot{v} = K x + D v + G u + w,                                (2)

a controller of which receives the vector measurement

    z = x + h tr(\Gamma'' \theta u^T) + n                         (3)

and selects the control vector u at each time instant t \ge 0. The time variable t is suppressed in the notation here, and the coefficient matrices may be time-varying. \theta is a constant but unknown parameter vector, w and n are zero-mean Gaussian white noise processes with respective covariance parameters Q and R/m^4, and h and m are positive scalars such that

    h \ll 1/m \ll 1.                                              (4)


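A priori,

    \begin{bmatrix} x(0) \\ v(0) \\ \theta \end{bmatrix}  is a Normal  \left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} P_{10} & 0 & 0 \\ 0 & P_{30} & 0 \\ 0 & 0 & L_0 \end{bmatrix} \right)          (5)
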

random  variable  independent  of  w  and  n.  The  objective  is  to  find  a 
control  law  that  minimizes  the  scalar  performance  criterion 


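    E \left\{ \tfrac{1}{2} [x_f^T \; v_f^T] \begin{bmatrix} S_1 & S_2 \\ S_2^T & S_3 \end{bmatrix} \begin{bmatrix} x_f \\ v_f \end{bmatrix} + \tfrac{1}{2} \int_0^{t_f} \left( [x^T \; v^T] \begin{bmatrix} A_1 & A_2 \\ A_2^T & A_3 \end{bmatrix} \begin{bmatrix} x \\ v \end{bmatrix} + u^T B u \right) dt \right\},          (6)
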


where E denotes prior expectation and t_f > 0 is some specified terminal time. As usual, a control law is defined as a decision rule that, for each t in [0, t_f), specifies the current control u(t) as a function of the current measurement history {(z(\psi), \psi) : 0 \le \psi < t}. Also, in the above, P_{10}, P_{30}, and L_0 are positive definite; B(t) and R(t) are positive definite for each t \in [0, t_f]; \begin{bmatrix} S_1 & S_2 \\ S_2^T & S_3 \end{bmatrix} is positive-semidefinite; and Q(t) and \begin{bmatrix} A_1(t) & A_2(t) \\ A_2^T(t) & A_3(t) \end{bmatrix} are positive-semidefinite for each t \in [0, t_f]. Without loss of generality, these matrices are assumed symmetric as well.


Finding such an optimal control law is very difficult, so we only consider the problem of finding an approximation thereof that is asymptotically accurate to order h^2 m^{3/2} for the inequalities of Equation 4, i.e., when 1/m and mh are both small. What is meant by such an approximation is that the control law always generates a control value u which is the same to order h^2 m^{3/2} as that generated by an optimal control law, except perhaps for a set of measurement histories of negligibly small probability. The size of m and the size of 1/(mh) if h \ne 0 are considered to be large enough here that the components of F, K, D, G, \Gamma, P_{10}, P_{30}, L_0, S_1, S_2, S_3, A_1, A_2, A_3, Q, Q^{-1}, R, R^{-1}, B, B^{-1}, and their time rates of change, if any, are always of order unity by comparison.


Also,  the  treatment  of  this  problem  is  limited  here  to  finding 
the  control  law  associated  with  a  cost-to-go  function  which  has  the 
formal  appearance  of  satisfying  the  Bellman  equation  corresponding 
to Equations 1-3, 5, and 6 to order h^2 m^{3/2}. This control law would
be  the  desired  asymptotic  approximation  if  the  equations  involved 
in  the  analysis  are  well  posed  and  the  formally  higher-order  terms 
in  them  are  indeed  so  in  some  appropriate  sense.  A  mathematically 
precise  verification  of  these  conditions  is  beyond  the  scope  of  this 
investigation,  however,  so  in  this  sense  the  control  law  obtained 
here  is  only  a  plausible  candidate  for  the  approximation  being 
sought.  This  plausibility  is  enhanced,  though,  by  the  fact  that  the 
actual  optimal  control  law  is  well  known  and  rigorously  justified  for 
h  =  0  (a  standard  linear-quadratic-Gaussian  case)  and  the 
approximation  derived  here  for  small  h  converges  to  this  control  law 
as h \to 0. Nevertheless, it is still important to augment this type of
formal  analysis  by  testing  the  results  on  specific  numerical 
examples.  One  such  example  is  included  here,  and  the  theory  seems 
to  give  reasonable  and  useful  results  in  this  case. 

Even formally, the asymptotic accuracy of this control law approximation is less than that obtained for the special case examined in Reference 1, where all control terms of order h^2 were included in the approximation. As it happened, the other order-h^2 control terms had an equally important effect on the performance criterion even though they were small compared to h^2 m^{3/2}. For the more limited purpose of investigating the salient features of control laws that are optimal for such criteria, however, it is consistent to limit the accuracy of the control law approximations here to order h^2 m^{3/2}. As in the example below, the performance criterion is often
used  only  as  a  device  to  generate,  by  its  optimization,  a  control  law 
with  desired  properties. 


MOTION-STATE  AND  PARAMETER  ESTIMATION 

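The motion state (x, v) of the dynamic system and the parameter vector \theta satisfy the linear system of equations

    \frac{d}{dt} \begin{bmatrix} x \\ v \\ \theta \end{bmatrix} = \begin{bmatrix} F & I & 0 \\ K & D & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ v \\ \theta \end{bmatrix} + \begin{bmatrix} 0 \\ G \\ 0 \end{bmatrix} u + \begin{bmatrix} 0 \\ I \\ 0 \end{bmatrix} w.          (7)
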


Since the initial value of the composite state (x, v, \theta) has a Normal
prior  probability  distribution  and  since  current  and  past  values  of  u 
are  presumed  known  to  the  controller,  it  is  a  standard  result 
(Reference  2)  that  the  current  conditional  probability  distribution  of 
this  composite  state,  given  current  and  past  values  of  z,  is  also 
Normal,  with  mean  and  covariance  matrix  given  by  the  Kalman  filter 
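equations for Equations 3, 5, and 7. If this conditional mean and covariance matrix are partitioned in the obvious way as

    \begin{bmatrix} \hat{x} \\ \hat{v} \\ \hat{\theta} \end{bmatrix}  and  \begin{bmatrix} P_1 & P_2 & E_1 \\ P_2^T & P_3 & E_2 \\ E_1^T & E_2^T & L \end{bmatrix},

these Kalman filter equations can be expressed as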

    \dot{\hat{x}} = F \hat{x} + \hat{v} + m^4 [P_1 + h (\Gamma^T E_1^T)' u] R^{-1} (z - \hat{z}) ;  \hat{x}(0) = 0          (8)

    \dot{\hat{v}} = K \hat{x} + D \hat{v} + G u + m^4 [P_2^T + h (\Gamma^T E_2^T)' u] R^{-1} (z - \hat{z}) ;  \hat{v}(0) = 0          (9)

    \dot{\hat{\theta}} = m^4 [E_1^T + h (\Gamma^T L)' u] R^{-1} (z - \hat{z}) ;  \hat{\theta}(0) = 0          (10)

    \dot{P}_1 = F P_1 + P_1 F^T + P_2 + P_2^T - m^4 [P_1 + h (\Gamma^T E_1^T)' u] R^{-1} [P_1 + h (E_1 \Gamma)' u] ;  P_1(0) = P_{10}          (11)

    \dot{P}_2 = F P_2 + P_2 D^T + P_1 K^T + P_3 - m^4 [P_1 + h (\Gamma^T E_1^T)' u] R^{-1} [P_2 + h (E_2 \Gamma)' u] ;  P_2(0) = 0          (12)

    \dot{P}_3 = K P_2 + P_2^T K^T + D P_3 + P_3 D^T + Q - m^4 [P_2^T + h (\Gamma^T E_2^T)' u] R^{-1} [P_2 + h (E_2 \Gamma)' u] ;  P_3(0) = P_{30}          (13)

    \dot{E}_1 = F E_1 + E_2 - m^4 [P_1 + h (\Gamma^T E_1^T)' u] R^{-1} [E_1 + h (L \Gamma)' u] ;  E_1(0) = 0          (14)

    \dot{E}_2 = K E_1 + D E_2 - m^4 [P_2^T + h (\Gamma^T E_2^T)' u] R^{-1} [E_1 + h (L \Gamma)' u] ;  E_2(0) = 0          (15)

    \dot{L} = -m^4 [E_1^T + h (\Gamma^T L)' u] R^{-1} [E_1 + h (L \Gamma)' u] ;  L(0) = L_0 ,          (16)

where

    \hat{z} = \hat{x} + h tr(\Gamma'' \hat{\theta} u^T).          (17)

It is also convenient to define

    \xi = m^2 (z - \hat{z}),          (18)

which is the normalized innovation process for this filter. As such, \xi can be treated as a zero-mean Gaussian white noise process with covariance parameter R in determining the statistical behavior of \hat{x}, \hat{v}, and \hat{\theta} (Reference 3).

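It happens that L varies more slowly than the other covariance matrix partitions. A key step that takes advantage of this is to define the nominal time functions \bar{P}_1, \bar{P}_2, and \bar{P}_3 for t > 0 by

    \dot{\bar{P}}_1 = F \bar{P}_1 + \bar{P}_1 F^T + \bar{P}_2 + \bar{P}_2^T - m^4 \bar{P}_1 R^{-1} \bar{P}_1 ;  \bar{P}_1(0) = P_1(0)          (19)

    \dot{\bar{P}}_2 = F \bar{P}_2 + \bar{P}_2 D^T + \bar{P}_1 K^T + \bar{P}_3 - m^4 \bar{P}_1 R^{-1} \bar{P}_2 ;  \bar{P}_2(0) = P_2(0)          (20)

    \dot{\bar{P}}_3 = K \bar{P}_2 + \bar{P}_2^T K^T + D \bar{P}_3 + \bar{P}_3 D^T + Q - m^4 \bar{P}_2^T R^{-1} \bar{P}_2 ;  \bar{P}_3(0) = P_3(0)          (21)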

and  let 

    N_1 = (m / h^2) (P_1 - \bar{P}_1) - m u^T \Gamma' L \Gamma^T u          (22)

    N_2 = (1 / h^2) (P_2 - \bar{P}_2)          (23)

    N_3 = (1 / (m h^2)) (P_3 - \bar{P}_3)          (24)

    M_1 = (m / h) [E_1 + h (L \Gamma)' u]          (25)

    M_2 = (1 / h) E_2.          (26)



It  follows  fairly  directly  from  known  properties  of  conditional 
covariance  and  precision  matrices  for  multivariate  Normal 
distributions  (Reference  4)  that  the  conditions  imposed  on  Q  and  R 
in  the  preceding  section  imply  that 


    All components of \bar{P}_1 are of order 1/m^3,
    All components of \bar{P}_1^{-1} are of order m^3,
    All components of \bar{P}_2 are of order 1/m^2, and
    All components of \bar{P}_3 are of order 1/m,

except perhaps for initial transients with durations of order 1/m. These magnitudes are established by considering the estimation problem for F = 0 and its usual dual for the precision matrix, and, for each i, deleting all measurements except z_i in bounding the variances of x_i and v_i (and likewise in the dual problem).
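
These orders of magnitude can be illustrated numerically in the simplest case. The sketch below assumes a scalar system with F = K = D = 0, so that the nominal covariance equations reduce to dP1/dt = 2 P2 - m^4 P1^2 / R, dP2/dt = P3 - m^4 P1 P2 / R, and dP3/dt = Q - m^4 P2^2 / R. It integrates them with a plain Euler step and compares the result with the steady state obtained by setting the derivatives to zero. The function names, step size, and initial values are hypothetical choices made only for this illustration.

```python
import math

def nominal_covariances(m, q=1.0, r=1.0, p10=0.01, p30=0.1, t_end=2.0, dt=2e-5):
    """Euler integration of the scalar nominal covariance equations (F = K = D = 0)."""
    p1, p2, p3 = p10, 0.0, p30
    m4 = m ** 4
    for _ in range(int(round(t_end / dt))):
        d1 = 2.0 * p2 - m4 * p1 * p1 / r
        d2 = p3 - m4 * p1 * p2 / r
        d3 = q - m4 * p2 * p2 / r
        p1 += dt * d1
        p2 += dt * d2
        p3 += dt * d3
    return p1, p2, p3

def steady_state(m, q=1.0, r=1.0):
    """Closed-form steady state of the same scalar equations."""
    p2 = math.sqrt(q * r) / m ** 2          # from dP3/dt = 0
    p1 = math.sqrt(2.0 * r * p2) / m ** 2   # from dP1/dt = 0
    p3 = m ** 4 * p1 * p2 / r               # from dP2/dt = 0
    return p1, p2, p3

m = 10.0
numeric = nominal_covariances(m)
exact = steady_state(m)
# Rescaled values m^3 P1, m^2 P2, m P3 should all be of order unity.
scaled = (numeric[0] * m ** 3, numeric[1] * m ** 2, numeric[2] * m)
print(scaled)
```

With Q = R = 1 the steady state is P1 = sqrt(2)/m^3, P2 = 1/m^2, and P3 = sqrt(2)/m, so the rescaled values stay of order unity as m grows, and the transient dies out over a time of order 1/m.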
APPROXIMATE ESTIMATOR BEHAVIOR FOR
A CLASS OF CONTROL LAWS

If h were zero, it is a standard result that the optimal control law for Equations 1-6 would be of the form u = -H \hat{x} - W \hat{v}, where H(t) and W(t) are certain deterministic time functions such that H, W, \dot{H}, and \dot{W} are all of order unity. Since we are only concerned with small h here, we consider control laws of the form

    u = -H \hat{x} - W \hat{v} + \eta          (27)

for which H and W are deterministic time functions, to be chosen for convenience later, such that H, W, \dot{H}, and \dot{W} are of order unity, and for which the components of \eta are small compared to unity, except perhaps for a negligibly improbable set of realizations. For such a control law, it follows from Equations 8, 9, 16, and 18 through 27 that


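    \dot{L} = -(mh)^2 M_1^T R^{-1} M_1,          (28)

    \dot{\hat{x}} = F \hat{x} + \hat{v} + (1/m) \{ m^3 \bar{P}_1 + (mh)^2 [N_1 - M_1 \Gamma^{T\prime} (H \hat{x} + W \hat{v} - \eta)] \} R^{-1} \xi,          (29)

and

    \dot{\hat{v}} = (K - GH) \hat{x} + (D - GW) \hat{v} + \{ m^2 \bar{P}_2^T + (mh)^2 [N_2^T - M_2 \Gamma^{T\prime} (H \hat{x} + W \hat{v} - \eta)] \} R^{-1} \xi.          (30)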
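
Expressions for \dot{M}_1, \dot{M}_2, \dot{N}_1, \dot{N}_2, and \dot{N}_3 can also be obtained for this case by differentiating Equations 22-27 and substituting from Equations 8-16 and 18-21. These expressions are quite lengthy in their entirety; however, retaining only the terms that are needed to determine the optimal control to order h^2 m^{3/2} reduces them to

    \dot{M}_1 = m [M_2 - m^3 \bar{P}_1 R^{-1} M_1 - m^2 (L \Gamma)' W \bar{P}_2^T R^{-1} \xi],          (31)

    \dot{M}_2 = -m^3 \bar{P}_2^T R^{-1} M_1,          (32)

    \dot{N}_1 = m (N_2 + N_2^T) - m^4 [\bar{P}_1 R^{-1} N_1 + N_1 R^{-1} \bar{P}_1
                - (M_1 \Gamma R^{-1} \bar{P}_1 + \bar{P}_1 R^{-1} \Gamma^T M_1^T)' (H \hat{x} + W \hat{v} - \eta)],          (33)

    \dot{N}_2 = m [N_3 - m^3 \bar{P}_1 R^{-1} N_2 - m^2 N_1 R^{-1} \bar{P}_2
                + (m^3 M_2 \Gamma R^{-1} \bar{P}_1 + m^2 \bar{P}_2^T R^{-1} \Gamma^T M_1^T)' (H \hat{x} + W \hat{v} - \eta)],          (34)

and

    \dot{N}_3 = D N_3 + N_3 D^T - m^3 [\bar{P}_2^T R^{-1} N_2 + N_2^T R^{-1} \bar{P}_2
                - (\bar{P}_2^T R^{-1} \Gamma^T M_2^T + M_2 \Gamma R^{-1} \bar{P}_2)' (H \hat{x} + W \hat{v} - \eta)].          (35)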

Establishing that these truncations are sufficiently accurate uses the orders of magnitude established earlier for \bar{P}_1, \bar{P}_2, \bar{P}_3, and \bar{P}_1^{-1} and follows a multivariate version of the corresponding analysis in Reference 1. This basically proceeds by assuming appropriate orders of magnitude for all the quantities involved and showing that no order-of-magnitude contradictions occur in any of the (untruncated) equations above or in the Bellman equation and approximate solution of the next section. It also entails analyzing Equations 29-35 as a noise-driven system to conclude by standard methods (Reference 3) that the M_1, M_2, N_1, N_2, and N_3 components are all random processes with values of order m^{1/2}, except perhaps for an initial time interval of order 1/m, which become approximately uncorrelated over a time interval of order 1/m. Since this and Equation 28 imply that L only changes by order (mh)^2 during the correlation time of M_1, it also follows from this argument that the difference (componentwise) between L and its prior expected value for such a control law is always small compared to unity (except for a set of realizations of negligible probability). The reason is that order-unity changes in an L-component behave basically as the sum of 1/(mh)^2 independent random increments, each with mean of order m and variance of order m^2. Hence, the variance of this sum is of order (mh)^2, which is small compared to unity by assumption.


CONTROL  OPTIMIZATION 

Since H(t) and W(t) in Equation 27 are considered specified, the problem here reduces to that of finding an optimal control law for the perturbation control \eta, to which we seek only an asymptotic approximation. A convenient choice of H and W will be used for this purpose, but one for which H, W, \dot{H}, and \dot{W} are of order unity.

An optimal expected cost-to-go function can be defined consistently in terms of time and the conditional distribution of x, v, and \theta (Reference 5). Thus, the Principle of Optimality of dynamic programming can be applied in the usual way (Reference 6) to derive a Bellman equation for this function, the solution of which specifies the optimal control law for \eta. Since the conditional distribution here is Normal and therefore specified by its first and second moments, such a solution can be expressed in terms of t, \hat{x}, \hat{v}, \hat{\theta}, M_1, M_2, N_1, N_2, N_3, and L. The derivation of the Bellman equation for this class of cost functions requires the conditional expected values of increments \Delta\hat{x}, \Delta\hat{v}, \Delta\hat{\theta}, \Delta L, \Delta M_1, \Delta M_2, \Delta N_1, \Delta N_2, \Delta N_3, and of quadratic products of their components, over an infinitesimal time increment \Delta t, given the data up to the beginning of this time increment. Since this conditioning is equivalent to conditioning on the conditional distribution of x, v, and \theta at that time, these expectations can be evaluated from (the untruncated versions of) Equations 28-35 and the corresponding equation for \hat{\theta} (which will not be needed for the level of accuracy retained below). Here, \Delta\hat{x} is taken as \dot{\hat{x}} \Delta t, etc.

In so doing, we retain only terms up to order h^2 m^{3/2} and h^2 m^{3/2} y (y any product of \eta-components) in the resulting Bellman equation. Also, we restrict consideration to possible solutions (also denoted J) of the form


iNi  +  Q2N2 


Q3  +  ~S3jN3]  +  [x’^v’^]  tr(A[Mi:  M2])|  +  f(t),  (36) 


where the S, Q, and \Lambda components are all of order unity and functions of t only, with S_1, S_3, Q_1, and Q_3 symmetric, and for which \dot{\eta}, the time-derivative of the associated optimal perturbation control, contributes only terms small compared to h^2 m^{3/2} to the Bellman equation for the choices

    H = B^{-1} G^T S_2^T
    W = B^{-1} G^T S_3.          (37)

These restrictions and choices of H and W reduce the resulting Bellman equation and boundary condition to
Jt  =  min  E 

i  [x-f  v’-] 

Ai  A2 

X 

'n  /data  (t) 

2 

aJ  A3 

.V. 

X  (S2X  +  S3V)  -  (x^S2  +  v^S3)  GB 


+  J^x  +  J;v  +  trdNjNi  +  JN2N2  +  JN3N3) 


+  tr 


jJiixx  +Ji;vx  +jJc;vv  J  dt 


+  Mitr(JMjix  +  JM,vV)dt 


J(tf.-)=  E 

/data  (tf) 


for  M2  as  approximated  by  Equation  32,  where  subscripts 
denote  partial  differentiation. 


[x^(tf)  :  v^(tf)] 

Si  §2 

X(tf)1 

> 

.Sl  S3. 

.v(tf)J 

From  Equation  36,  the  indicated  partial  derivatives  are 

QiNi  +  Q2N2 

+  jmS3  jN3l  +  h^[x’^v’^]  tr  (A[Mi  i  M2])  +  f 


Jt  =  Y^^SiX  +  x^S2V  +  ■— v^S3V  +  h^  tr 


Ji=  x'^Si  +  v'^S2+  h^  tr”^  (AilMi  i  M2]) 


J;  =  x'^S2  +  v'^S3  +  h^  tr”^  (A2[Mi  i  M2]) 


(38) 

now 
lN,=  h^(Q3  +  f-S3] 

Jii  =  Si 
Jiv  =  S2 
Jw=  S3 

JM,i  = 

Jmjv  =  h  A4  , 

where \Lambda_1, \Lambda_2, and \Lambda_4 have components which are either zero or components of \Lambda. Substituting these expressions for the partial derivatives in Equation 38 and using the fact that L is approximately a deterministic time function, which from Equations 28 and 31 is independent of the perturbation control, allows the conditional expectation in Equation 38 to be evaluated to the desired accuracy with Equations 28-35 to give the minimand of Equation 38 as
l{x’^(Ai  +  SaGB'^G'^Sjx  +  v'^(A3  +  S3GB"^G'^S3)v  +  ti'^Bti 
+  mh^  tr  [A3N3  +  S3(DN3  +  N3D'^)]} 

+  x‘^(A2+  S2GB"^G’^S3)v  +  (x'^Si  +  v'^S2)  (Fx  +  v) 

+  (x'^S2  +  v'^S3)  [(K-  GB-^G'^sJ)x  +  (D  -  GB-^g'^S3)v] 

+  mh^  tr  {Qi[N2  +  n'J  -  (m^Pi)R~*Ni  -  NiR"^(m^Pi) 
+  m^(MirR"^Pi  +  PiR'V’^m]')' 

X  (B-^G’^S^x  +  B"^G'^S3V  -  Ti)]} 

+  mh^  tr  [Q2(N3  -  m^PiR~*N2  -  m^NiR~^P2)] 

+  mh^  tr  {Q2(m^M2rR'^Pi  +  m^PjR-^r'^Mf)' 

X  [B"^G^(S2X  +  S3v)-t|]}  +  mh^  tr  [m^S2p2R~^Ni 

+  m^N’5R“^PiS2-  ni^Q3(PjR‘^N2  +  N2R~^P2)] 

+  mh^  tr  {m^Q3(P2R'^r'^M2+  M2rR"^P2)' 

X  [B'^G’^CS^x  +  S3V)  -  Ti]}  -  mh^  tr  {(m^S^PiR-^r’^Ml 
+  m^MirR'^P2S|)'  [B"^G’^(Sli  +  S3V)  -  ti]}  +  f(t). 


Equating the \eta-derivative of this expression to zero gives the minimizing perturbation control as

    \eta = m h^2 B^{-1} tr \{ \Gamma R^{-1} [m^3 \bar{P}_1 (2 Q_1 M_1 + (Q_2^T - S_2) M_2)
                                              + m^2 \bar{P}_2 (2 Q_3 M_2 + (Q_2 - S_2^T) M_1)] \},          (39)

which results in a negligible contribution of the \eta-dependent terms in the minimand to -J_t. Collecting the remaining terms in like powers of the a priori random variables, evaluating the terminal boundary expectation in Equation 38, and using Equations 22-24 and the fact that N_1 = N_1^T and N_3 = N_3^T, show that the Bellman Equation 38 is satisfied to order h^2 m^{3/2} by the function J of Equation 36 if f and the S, Q, and \Lambda components satisfy the terminal-value system of ordinary differential equations:

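    -\dot{S}_1 = S_1 F + F^T S_1 + S_2 K + K^T S_2^T + A_1 - S_2 G B^{-1} G^T S_2^T ;  S_1(t_f) = S_1          (40)

    -\dot{S}_2 = S_1 + S_2 D + F^T S_2 + K^T S_3 + A_2 - S_2 G B^{-1} G^T S_3 ;  S_2(t_f) = S_2          (41)

    -\dot{S}_3 = S_2 + S_2^T + S_3 D + D^T S_3 + A_3 - S_3 G B^{-1} G^T S_3 ;  S_3(t_f) = S_3          (42)

    -\dot{Q}_1 = m \{ R^{-1} [\tfrac{1}{2} (m^2 \bar{P}_2)(S_2^T - Q_2) - (m^3 \bar{P}_1) Q_1]
                 + [\tfrac{1}{2} (S_2 - Q_2^T)(m^2 \bar{P}_2)^T - Q_1 (m^3 \bar{P}_1)] R^{-1} \} ;  Q_1(t_f) = 0          (43)

    -\dot{Q}_2 = m [(S_2^T - Q_2)(m^3 \bar{P}_1) R^{-1} + 2 Q_1 - 2 Q_3 (m^2 \bar{P}_2)^T R^{-1}] ;  Q_2(t_f) = 0          (44)

    -\dot{Q}_3 = (m/2) (Q_2 - S_2^T + Q_2^T - S_2 + S_3 G B^{-1} G^T S_3) ;  Q_3(t_f) = 0          (45)

    -\dot{\Lambda}_i = similar expression ;  \Lambda_i(t_f) = 0 ;  i = 1, 4

    -\dot{f} = similar expression ;  f(t_f) = \tfrac{1}{2} tr [S_1 \bar{P}_1(t_f) + 2 S_2 \bar{P}_2^T(t_f) + S_3 \bar{P}_3(t_f)],
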

since these S, Q, and \Lambda components will be of order unity. As a consequence of the dynamic programming procedure (Reference 6), the optimal perturbation control for Equations 27 and 37 is then given to order h^2 m^{3/2} by the corresponding \eta of Equation 39.

It also follows from differentiating Equation 39, from substituting for the derivatives in the resulting expression, and from the previously established orders of magnitude for the quantities involved that the time-derivative of \eta is small enough that it would contribute only negligibly to the Bellman equation, as was assumed.


IMPLEMENTATION  AND  EXTENSION 
Defining the matrices

    C_1 = m^3 R^{-1} \bar{P}_1,

    C_2 = m^2 R^{-1} \bar{P}_2,

    Y = \tfrac{1}{4} S_3 G B^{-1} G^T S_3,

    S = \tfrac{1}{4} (S_2^T - Q_2) + \tfrac{1}{4} (S_2 - Q_2^T)     ( = S^T),

and

    A = \tfrac{1}{4} (S_2^T - Q_2) - \tfrac{1}{4} (S_2 - Q_2^T)     ( = -A^T)

and using Equations 43-45 gives

    -\dot{Q}_1 / m = C_2 (S + A) + (S + A)^T C_2^T - C_1 Q_1 - Q_1 C_1^T,

    -\dot{Q}_3 / m = 2 (Y - S),

    -\dot{S} / m = \tfrac{1}{2} (Q_3 C_2^T - S C_1^T - Q_1) + \tfrac{1}{2} (C_2 Q_3 - C_1 S - Q_1) + \tfrac{1}{2} (C_1 A - A C_1^T),

and

    -\dot{A} / m = -\tfrac{1}{2} (A C_1^T + C_1 A) + \tfrac{1}{2} (Q_3 C_2^T - S C_1^T - C_2 Q_3 + C_1 S).

Since C_1, C_2, Y, and C_1^{-1} are all of order unity, this system of differential equations settles in reverse time with time constants of order 1/m to its steady-state solution with


    A = 0,

    S = Y,          (46)

    C_1 Q_1 + Q_1 C_1^T = C_2 Y + Y C_2^T,          (47)

and

    Q_1 = Q_1^T = \tfrac{1}{2} (C_2 Q_3 + Q_3 C_2^T - C_1 Y - Y C_1^T).          (48)

By the preceding definitions and Equation 39,

    \eta = 2 m h^2 B^{-1} tr [\Gamma (C_1 Q_1 - C_2 S) M_1 + \Gamma (C_2 Q_3 - C_1 S) M_2].

From Equations 46-48, therefore,

    \eta = 2 m h^2 B^{-1} tr (\Gamma C_1^{-1} C_2 Y M_2)

to order h^2 m^{3/2}, except within order 1/m of the terminal time. From the definitions of C_1, C_2, and Y, this perturbation control is

    \eta = (h^2 / 2) B^{-1} tr (\Gamma \bar{P}_1^{-1} \bar{P}_2 S_3 G B^{-1} G^T S_3 M_2).          (49)

The final result can be summarized more conveniently by absorbing h and m into \Gamma and R, so that

    z = x + tr (\Gamma'' \theta u^T) + GWN(R).          (50)

Except within a terminal time interval of order 1/m (which will no longer be considered here), the optimal control law can then be approximated to order h^2 m^{3/2} by

    u = -B^{-1} G^T (S_2^T \hat{x} + S_3 \hat{v}) + \tfrac{1}{2} B^{-1} tr (\Gamma P_1^{-1} P_2 S_3 G B^{-1} G^T S_3 M_2),          (51)

where S_1, S_2, and S_3 are as determined by Equations 40-42 and \hat{x}, \hat{v}, P_1, P_2, and M_2 are generated in real time from the incoming measurements z by the differential equation system
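
    \dot{\hat{x}} = F \hat{x} + \hat{v} + (P_1 + E_1 \Gamma^{T\prime} u) R^{-1} (z - \hat{z}) ;  \hat{x}(0) given

    \dot{\hat{v}} = K \hat{x} + D \hat{v} + G u + (P_2^T + E_2 \Gamma^{T\prime} u) R^{-1} (z - \hat{z}) ;  \hat{v}(0) given

    \dot{\hat{\theta}} = (E_1^T + L \Gamma^{T\prime} u) R^{-1} (z - \hat{z}) ;  \hat{\theta}(0) = 0

    \dot{P}_1 = F P_1 + P_1 F^T + P_2 + P_2^T - (P_1 + E_1 \Gamma^{T\prime} u) R^{-1} [P_1 + (E_1 \Gamma)' u] ;  P_1(0) = P_{10}

    \dot{P}_2 = P_1 K^T + F P_2 + P_2 D^T + P_3 - (P_1 + E_1 \Gamma^{T\prime} u) R^{-1} [P_2 + (E_2 \Gamma)' u] ;  P_2(0) = 0

    \dot{P}_3 = K P_2 + P_2^T K^T + D P_3 + P_3 D^T + Q - (P_2^T + E_2 \Gamma^{T\prime} u) R^{-1} [P_2 + (E_2 \Gamma)' u] ;  P_3(0) = P_{30}

    \dot{E}_1 = F E_1 + E_2 - (P_1 + E_1 \Gamma^{T\prime} u) R^{-1} [E_1 + (L \Gamma)' u] ;  E_1(0) = 0

    \dot{E}_2 = K E_1 + D E_2 - (P_2^T + E_2 \Gamma^{T\prime} u) R^{-1} [E_1 + (L \Gamma)' u] ;  E_2(0) = 0

    \dot{L} = -(E_1^T + L \Gamma^{T\prime} u) R^{-1} [E_1 + (L \Gamma)' u] ;  L(0) = L_0

    \dot{M}_1 = -P_1 R^{-1} M_1 - M_2 + (L \Gamma)' B^{-1} G^T S_3 P_2^T R^{-1} (z - \hat{z}) ;  M_1(0) = 0          (52)

    \dot{M}_2 = P_2^T R^{-1} M_1 ;  M_2(0) = 0          (53)

with

    \hat{z} = \hat{x} + tr (\Gamma'' \hat{\theta} u^T)

and u as given by Equation 51.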

This result can be extended to a more general case. In the preceding context, with c and e denoting the composite vectors

    \begin{bmatrix} x \\ v \end{bmatrix}  and  \begin{bmatrix} x \\ v \\ \theta \end{bmatrix},

respectively, the dynamics in this extension are of the form


c  =  Fc  +  Gu  +  tr  (Aee^)  +  tr  (Qeu^)  +  (I  +  'Fe) 


0  =  (mh)w2 


0 

w 


(54) 


where 

form 


F  r 

'o' 

and  G  = 

.K  D 

.G 

the  state  measurements  are  of  the 


z  =  X  +  tr  (rOu^)  +  tr  (Acu^)  +  tr  (Oee^)  +  n,  (55) 

and  the  criterion  to  be  minimized  is  of  the  form 

r 

Cf  (S  +  nCf)Cf  +  J  j^a^u  +  c^(A  +  5e)c 

+  u^(B  +  2e  +  ©u)u  jdt 

Here, B(t) is symmetric and positive-definite; S, A(t), and the covariance parameter of w_2 are symmetric and positive-semidefinite; the components of a(t), A(t), B(t), B^{-1}(t), S, and the covariance parameter of w_2 are of order unity; all the components of the three-way matrices are of order h; and w_2 is statistically independent of w, n, and the prior distribution. This is a special case of the class of control problems treated to order-h accuracy in Reference 7 for R and R^{-1} of order unity, where R now denotes the covariance parameter of n itself (and so is of order m^{-4} here).

Adapting the derivation of Reference 7 to the more accurate measurements here and retaining any additional terms affecting the result to order h^2 m^{3/2} simply add the second "dither" term of Equation 51 to the control law of Reference 7 for this case. M_2 is generated by Equations 52 and 53, where P_1 and P_2 denote the corresponding xx^T and xv^T covariance matrix partitions of the standard extended Kalman filter for Equations 54 and 55 and where

z  =  X  +  tr  (r"0u'^)  +  tr  (Q"ex'^). 


EXAMPLE  .  MISSILE  PITCH  AUTOPILOT 
(All  variables  are  scalars  in  this  section) 

The  attitude  dynamics  of  a  missile  in  its  pitch  plane  are  often 
approximated  as 

    \dot{\alpha} = q - g / V_m          (56)

    \dot{q} = A \alpha + B \delta          (57)

with

    g = F \alpha + H \delta,          (58)

where (see Figure 1)

    \alpha = angle of attack
    q = pitch angle rate
    g = missile acceleration (in pitch plane) normal to its body axis
    V_m = missile airspeed
    \delta = fin deflection angle
and  where 


A  =  ^C„«. 


p  Qsd  p 

°  ^m5 


Qs  c_ 


with 


    Q = dynamic pressure (\tfrac{1}{2} V_m^2 \times atm. density)


s  =  missile  cross-sectional  area  parameter 
d  =  missile  length  parameter 

I  =  missile  rotational  moment  of  inertia  (in  pitch  plane) 
M  =  missile  mass, 

and  Cpj^,  Cjjjg,  and  Cjj§  are  the  usual  "aerodynamic  derivatives." 
These aerodynamic derivatives are generally treated as constants
for  any  given  missile,  although  in  reality  they  depend  at  least 
weakly  on  Mach  number,  angle  of  attack,  and  other  variables.  The 
fin  deflection  5  is  considered  the  control  variable  here,  and  the 
controller  is  assumed  to  have  measurements  only  of  the  current 
normal  acceleration  g.  Measurements  of  the  pitch  rate  q  also  could 
be  obtained  from  gyroscopes,  but  such  additional  instrumentation 
would  add  to  the  complexity  and  fragility  of  a  missile.  Hence,  it  is  of 
interest  to  see  what  can  be  done  without  it.  The  dynamic  system  in 
this  formulation  would  be  that  of  Equations  56  and  57,  with 
Equation 58 substituted for g in Equation 56 if A, B, F, H, and V_m are treated as known quantities.

The  objective  in  designing  the  control  law  (autopilot)  is  to 
make  the  actual  acceleration  g(t)  follow  any  reasonable  commanded 
history  c(t).  Equilibrium  conditions  can  be  found  for  constant  c  by 

solving Equations 56-58 with \dot{\alpha} = \dot{q} = 0 and g = c, which gives

    \bar{\alpha} = [B / (FB - AH)] c,          (59)

    \bar{q} = c / V_m,          (60)

and

    \bar{\delta} = [A / (AH - FB)] c          (61)

for the corresponding values of \alpha, q, and \delta. A simple option would
be  to  use  Equation  61  as  an  open-loop  control  law,  using  nominal 
values  of  A,  B,  F,  and  H.  However,  missiles  are  typically  so 
underdamped  that  this  does  not  work  well  even  at  the  nominal 
speed  and  altitude  to  which  these  values  correspond  (see  Figure  2a). 
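
The equilibrium values can be checked by substituting them back into the pitch-plane equations. The sketch below assumes the solution alpha = B c / (FB - AH), q = c / V_m, delta = A c / (AH - FB), obtained by setting the time derivatives in Equations 56 and 57 to zero with g = c. The coefficient values are hypothetical placeholders chosen only to exercise the formulas; they are not data for any real missile.

```python
def equilibrium(A, B, F, H, Vm, c):
    """Equilibrium (alpha, q, delta) for a constant commanded acceleration c."""
    det = F * B - A * H
    if det == 0.0:
        raise ValueError("FB - AH must be nonzero for a unique equilibrium")
    alpha = B * c / det
    q = c / Vm
    delta = -A * c / det          # same as A c / (AH - FB)
    return alpha, q, delta

def residuals(A, B, F, H, Vm, c, alpha, q, delta):
    """Right-hand sides of the two rate equations and the acceleration error."""
    g = F * alpha + H * delta
    alpha_dot = q - g / Vm
    q_dot = A * alpha + B * delta
    return alpha_dot, q_dot, g - c

# Hypothetical coefficient values, for illustration only.
A, B, F, H, Vm, c = -20.0, -80.0, 400.0, 40.0, 600.0, 9.81
alpha_eq, q_eq, delta_eq = equilibrium(A, B, F, H, Vm, c)
res = residuals(A, B, F, H, Vm, c, alpha_eq, q_eq, delta_eq)
print(alpha_eq, q_eq, delta_eq, res)
```

All three residuals vanish up to rounding, which confirms that the stated values are a fixed point of the nominal dynamics for constant c.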

In  this  context,  however,  if  one  defines 


    x = \alpha - \bar{\alpha},          (62)

    v = q - g / V_m,          (63)

    u = \delta - \bar{\delta},          (64)

and the measurement variable

    z = (g - c) / F - (H / F)(\delta - \bar{\delta}),          (65)

then it follows from Equations 56-61 that

    \dot{x} = v,          (66)

    \dot{v} = A x + B u,          (67)

and

    z = x          (68)


for constant c at the nominal conditions. Since g is mainly the result of body lift for a missile (i.e., F \gg H), it is approximately proportional to \alpha. Hence, it is reasonable to seek a control law for which x (deviation of \alpha from its equilibrium value for the commanded acceleration c) behaves as a high-frequency critically damped sinusoid. If the full state (x, v) could be measured, the control law that minimizes the criterion

    J = E \left\{ \int_0^{t_f} (x^2 + G^2 u^2) dt \right\} ,  G > 0          (69)

for the system of Equations 62 and 63 with white noise added to v can be found by standard methods (Reference 8). As long as G \ll |B/A|, this control law is approximately


for t_f - t \gg \sqrt{G / |B|}, i.e., for any fixed t as the criterion is changed so that t_f \to \infty. Substituting Equation 70 for u in Equation 67 shows that x behaves as a damped sinusoid with natural frequency

    \Omega = [(B/G)^2 -

and damping ratio


FIGURE 2. Response to 1g Step Command. (Panels: open-loop control and
non-adaptive autopilot; flight conditions Mach 1.5, 60 kft.; Mach 2,
20 kft. (nominal); Mach 3, 3 kft.)


FIGURE 3. Adaptive Autopilot Response. (Columns: without dither and
with dither; vertical axis: acceleration; flight conditions Mach 1.5,
60 kft.; Mach 2, 20 kft. (nominal); Mach 3, 3 kft.)

For G small enough that $\Omega \gg \sqrt{|A|}$, $\zeta \approx
1/\sqrt{2}$ (critical damping) and x behaves as desired. Under these
conditions, $\Omega \approx \sqrt{|B|/G}$; thus, using the control law

    u = -(\Omega^2 x + \sqrt{2}\,\Omega v)/B    (71)

would make x behave approximately as a critically damped sinusoid
of frequency $\Omega$ as long as $\Omega \gg \sqrt{|A|}$.
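
As a rough check on this claim, the sketch below substitutes the control law of Equation 71 into Equation 67, which gives x_ddot + sqrt(2)*Omega*x_dot + (Omega**2 - A)*x = 0, and reads off the natural frequency and damping ratio. The function name and the numerical values are hypothetical and serve only as an illustration.

```python
import math


def closed_loop_modes(A, Omega):
    """Natural frequency and damping ratio of x under Equation 71.

    With u = -(Omega**2 * x + sqrt(2) * Omega * v) / B in
    v_dot = A*x + B*u, the gain B cancels and
        x_ddot + sqrt(2)*Omega*x_dot + (Omega**2 - A)*x = 0.
    """
    wn_sq = Omega**2 - A
    if wn_sq <= 0.0:
        raise ValueError("Omega**2 must exceed A for a stable oscillatory mode")
    wn = math.sqrt(wn_sq)                        # natural frequency
    zeta = math.sqrt(2.0) * Omega / (2.0 * wn)   # damping ratio
    return wn, zeta


# Hypothetical values with Omega well above sqrt(|A|) = 10.
wn, zeta = closed_loop_modes(A=-100.0, Omega=60.0)
```

For these values the natural frequency stays within about 1.5 percent of Omega and the damping ratio within about 0.01 of 1/sqrt(2); both discrepancies shrink as Omega grows relative to sqrt(|A|).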

Only the x-component of the state (x, v) is measured directly,
however, so v must be estimated. One way of doing this is to
replace x and v in Equation 71 by the estimates produced from the
measurements of Equation 65 by the Kalman filter corresponding to
Equations 66-68, with white noises added to Equations 67 and 68.
If the respective variance parameters of these noises are $q_p$ and r,
this filter will have a settling time $\tau$ of about $(r/q_p)^{1/4}$
(see below). This settling time completely determines the effect of
$q_p$ and r on the filter estimates and should be chosen so that
$\Omega\tau < 1$ for the purpose of using these estimates in
Equation 71.
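
The quarter-power settling time can be illustrated with the steady-state Kalman-Bucy filter for a pure double integrator (x_dot = v, v_dot = w, z = x + n), which is what Equations 66-68 reduce to when the A and B terms are set aside. This is a simplifying assumption made here for illustration; the function names and numbers are hypothetical.

```python
import math


def steady_state_filter(q_p, r):
    """Steady-state covariance and gains for x_dot = v, v_dot = w, z = x + n.

    q_p and r are the variance parameters of the white noises w and n.
    """
    tau = (r / q_p) ** 0.25                      # settling-time scale
    p11 = math.sqrt(2.0) * q_p**0.25 * r**0.75   # var(x error)
    p12 = math.sqrt(q_p * r)                     # cov(x error, v error)
    p22 = math.sqrt(2.0) * q_p**0.75 * r**0.25   # var(v error)
    k1, k2 = p11 / r, p12 / r                    # gains on the innovation
    return tau, (p11, p12, p22), (k1, k2)


def riccati_residual(p, q_p, r):
    """Entries of the algebraic Riccati equation; each should be zero."""
    p11, p12, p22 = p
    return (2.0 * p12 - p11**2 / r,
            p22 - p11 * p12 / r,
            q_p - p12**2 / r)


tau, p, (k1, k2) = steady_state_filter(q_p=4.0, r=0.0004)
residual = riccati_residual(p, q_p=4.0, r=0.0004)
```

In this simplified case the gains are k1 = sqrt(2)/tau and k2 = 1/tau**2, so the filter error dynamics depend on q_p and r only through tau, in line with the statement above.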

Finally, some of the approximation errors can be canceled out
by using the postulated dynamics to replace the estimated value of
x in this control law by equivalent quantities involving the actual
and commanded values of the normal acceleration, which are
actually known directly. From Equation 58 and the definitions of
$\bar{\alpha}$ and $\bar{\delta}$,

    g - c = F(\alpha - \bar{\alpha}) + H(\delta - \bar{\delta}).

But from the definitions of x and u, the control of Equation 71 is

    \delta - \bar{\delta} = -[\Omega^2(\alpha - \bar{\alpha}) + \sqrt{2}\,\Omega \hat{v}]/B,

where $\hat{v}$ denotes the Kalman filter estimate of v, so it can be
expressed as

    \delta - \bar{\delta} = -\{\Omega^2[g - c - H(\delta - \bar{\delta})]/F + \sqrt{2}\,\Omega \hat{v}\}/B.

Solving for $\delta$ and substituting from Equation 61 for
$\bar{\delta}$ then gives

    \delta = \frac{Ac}{AH - FB} - \frac{\Omega^2 (g - c) + \sqrt{2}\,F\Omega \hat{v}}{FB - \Omega^2 H}    (72)

as the desired control law, where the role of the Kalman filter is now
limited to providing $\hat{v}$ from the derived measurements of
Equation 65, which from Equation 61 can be expressed as

    z = \frac{g - H\delta}{F} + \frac{Bc}{AH - FB}.    (73)

This is essentially the concept of plant inversion via state feedback
described in Reference 9.
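
The algebra leading to Equations 72 and 73 can be spot-checked numerically. The sketch below evaluates both equations and confirms that the resulting fin command is the one Equation 71 would give with x replaced by the derived measurement z. Function names and all numbers are hypothetical, and v_hat stands in for the Kalman filter estimate, which is not computed here.

```python
import math


def delta_bar(A, B, F, H, c):
    """Equilibrium fin deflection, Equation 61."""
    return A * c / (A * H - F * B)


def nonadaptive_control(A, B, F, H, Omega, g, c, v_hat):
    """Fin command of Equation 72."""
    feedback = ((Omega**2 * (g - c) + math.sqrt(2.0) * F * Omega * v_hat)
                / (F * B - Omega**2 * H))
    return delta_bar(A, B, F, H, c) - feedback


def derived_measurement(A, B, F, H, g, c, delta):
    """Derived measurement z of Equation 73."""
    return (g - H * delta) / F + B * c / (A * H - F * B)


# Hypothetical nominal coefficients and signals.
A, B, F, H, Omega = -100.0, -200.0, 1500.0, 150.0, 20.0
g, c, v_hat = 10.0, 32.2, 0.05

delta = nonadaptive_control(A, B, F, H, Omega, g, c, v_hat)
z = derived_measurement(A, B, F, H, g, c, delta)

# Equation 71 evaluated with x replaced by the derived measurement z.
u_from_71 = -(Omega**2 * z + math.sqrt(2.0) * Omega * v_hat) / B
```

The difference delta - delta_bar(A, B, F, H, c) agrees with u_from_71 to rounding error, so Equation 72 is Equation 71 with the estimate of x replaced by quantities the controller knows directly.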

A  control  law  of  this  type  performs  well  at  the  nominal 
conditions  for  which  it  is  designed  (see  Figure  2b),  as  might  be 
expected  from  its  use  of  feedback.  It  can  still  perform  badly, 
however,  if  the  missile  speed  and  altitude  are  very  different  from 
these  nominal  values  (see  Figures  2c  and  2d).  This  shows  the  need 
for  an  adaptive  extension  of  such  a  control  law. 

It was found empirically that the dynamic pressure Q and, to a
lesser extent, the aerodynamic derivative $C_{m\alpha}$ are important
parameters to estimate adaptively. For this purpose, it is preferable
to use ln(Q) as the dynamic-pressure parameter, since it can
legitimately have a Normal distribution; also, it is preferable to use
Equation 58 to eliminate $\alpha$ in Equation 57 so that g can be used
there as a directly known quantity to cancel out additional
approximation errors. (This latter stratagem did not help in the case
of the simple nonadaptive autopilot.) If variations in the other
aerodynamic derivatives are ignored, this allows the dynamic
system to be expressed as

    \dot{\alpha} = q - g/V_m,    (74)

    \dot{q} = (\bar{B} - \psi \bar{A}\bar{H}/\bar{F})\, e^{\theta} \delta + \psi \frac{\bar{A}}{\bar{F}}\, g + w_p,    (75)

    \dot{\theta} = w_\theta,    (76)

    \dot{\psi} = w_\psi,    (77)

where the lateral acceleration g is treated as a known quantity.
$\bar{A}$, $\bar{B}$, $\bar{F}$, and $\bar{H}$ are evaluated at some
nominal missile altitude and airspeed. Also,

    \theta = \ln(Q/\bar{Q}), \qquad \psi = C_{m\alpha}/\bar{C}_{m\alpha},

where $\bar{Q}$ and $\bar{C}_{m\alpha}$ are the values of Q and
$C_{m\alpha}$ at the nominal altitude and airspeed,
and $w_p$, $w_\theta$, and $w_\psi$ are independent noise processes
introduced for a realistic degree of uncertainty. The variance
parameters used for these noises are, respectively,

    $q_p$, left as a design parameter,
    $q_\theta$ = (prior variance of $\theta$)/T,
    $q_\psi$ = (prior variance of $\psi$)/T,

where T is the time scale over which the parameters can be
expected to change drastically because of altered flight conditions.
The control optimization approach used above for the nominal
flight conditions can be extended to the present situation by
considering $\bar{\alpha}$, $\bar{q}$, $\bar{\delta}$, x, v, and u of
Equations 59-64 for the actual but a priori unknown flight conditions
and the variable z, defined as

    z = \frac{g - c}{\bar{F}} - \frac{\bar{H}}{\bar{F}}(\delta - \bar{\delta}),    (78)

where $\bar{F}$ and $\bar{H}$ are still the values of F and H at the
known nominal condition. Then it follows from Equations 59-61, 74, and
75 that (for a constant commanded acceleration c),

    \dot{x} = v,    (79)

    \dot{v} = e^{\theta} \bar{B} u + \psi \frac{\bar{A}}{\bar{F}} (g - c - \bar{H} e^{\theta} u) + w_p,    (80)

and

    z = e^{\theta} x + \frac{\bar{H}}{\bar{F}} (e^{\theta} - 1) u.    (81)
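
The step from Equation 78 to Equation 81 uses only the scaling of F and H with dynamic pressure. The sketch below checks it numerically, assuming F = F_bar*exp(theta) and H = H_bar*exp(theta) (as in Equation 92) and g - c = F*x + H*u (from Equation 58 and the equilibrium definitions); the function names and numbers are hypothetical.

```python
import math


def z_from_definition(F_bar, H_bar, g_minus_c, u):
    """Equation 78: z built from the nominal values F_bar and H_bar."""
    return g_minus_c / F_bar - (H_bar / F_bar) * u


def z_from_states(F_bar, H_bar, theta, x, u):
    """Equation 81: the same quantity in terms of x, u, and theta."""
    e = math.exp(theta)
    return e * x + (H_bar / F_bar) * (e - 1.0) * u


# Hypothetical nominal coefficients and an off-nominal flight condition.
F_bar, H_bar, theta = 1500.0, 150.0, 0.7
x, u = 0.02, -0.01

# Actual F and H scale with dynamic pressure through exp(theta).
g_minus_c = math.exp(theta) * (F_bar * x + H_bar * u)

z78 = z_from_definition(F_bar, H_bar, g_minus_c, u)
z81 = z_from_states(F_bar, H_bar, theta, x, u)
```

The two values coincide, which shows how the unknown theta enters the measurement multiplicatively with both x and u; the term in theta*u is the bilinear measurement term referred to in the introduction.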

Consider choosing the autopilot design frequency $\Omega$ so that
$\Omega \gg \sqrt{|A|}$ for any "reasonable" flight condition. If z
could be measured and $\delta$ computed from u, the desired autopilot
behavior could then be approximated by generating u with the control
law that minimizes the criterion 69, with $G = |B|/\Omega^2$, for the
dynamics of Equations 76, 77, 79, and 80 (see discussion of Equations
69-71). Since $\bar{\delta}$ is defined in terms of the unknown actual
flight conditions, the controller cannot really measure z or determine
the actual control $\delta$ from u. However, $\bar{\delta}$ is a rather
small quantity. As an expedient approximation, the optimal control law
for u is determined as if $\bar{\delta}$ (but not A, B, F, and H) were
known and then added to an estimate of $\bar{\delta}$. For this
purpose, A is also ignored in Equation 80 as relatively small, T is
assumed large enough that $w_\theta$ and $w_\psi$ in Equations 76 and
77 can be ignored, and a low-intensity noise n is added to z of
Equation 81 to provide a realistic degree of uncertainty. Then, for
small $\theta$, this control problem becomes approximately that of
minimizing

    J = E\left\{ \int_{t_0}^{t_f} \left[ x^2 + \frac{\bar{B}^2}{\Omega^4} (1 + 2\theta) u^2 \right] dt \right\}    (82)

(with $t_f$ large) for the dynamics

    \dot{x} = v,    (83)

    \dot{v} = \bar{B} u + \bar{B} \theta u + w_p,    (84)

    \dot{\theta} = 0,    (85)

and the state measurements

    z = x + \theta x + \frac{\bar{H}}{\bar{F}} \theta u + n.    (86)


The variance parameter of n in Equation 86 is taken as $\tau^4 q_p$,
where $\tau$ is some specified time constant such that
$\Omega\tau \ll 1$.

Since $\tau$ is small, Equations 82-86 become an optimal control
problem  of  the  form  analyzed  above  for  0  ->  0.  Applying  the  results 
developed  there  shows  that  (away  from  the  terminal  boundary  at 
tf)  the  optimal  control  law  is  approximated  asymptotically  by 
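
    u = -\frac{\Omega^2 \hat{x} + \sqrt{2}\,\Omega \hat{v}}{\bar{B}(1 + \hat{\theta})} - [illegible]\, \frac{\bar{H}}{\bar{F}}\, \eta,    (87)

where $\eta$ in the second control term (called dither) is generated by

    \dot{\eta} = y/\tau,

    \dot{y} = \frac{1}{\tau} \left[ [illegible]\, L (z - \hat{z}) - \eta - \sqrt{2}\, y \right],    (88)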


and where $\hat{x}$, $\hat{v}$, and $\hat{\theta}$ are the approximate
conditional expectations of x, v, and $\theta$ produced by the extended
Kalman filter for Equations 83-86, L is the corresponding conditional
variance of $\theta$, and $\hat{z}$ is the approximate conditional
expectation of z, namely

    \hat{z} = (1 + \hat{\theta}) \hat{x} + \frac{\bar{H}}{\bar{F}} \hat{\theta} u.    (89)

To convert this result into a feasible control law as a
sophisticated user might, the quantity $(1 + \hat{\theta})$ is replaced
by $e^{\hat{\theta}}$ for robustness and the reasoning used in deriving
Equation 72 from Equation 71 is applied to Equation 87 to obtain

    \delta = \hat{\bar{\delta}} - \frac{[illegible]}{\hat{F}\hat{B} - \Omega^2 \hat{H}} - (\Omega^2 H/B)\, \eta,    (90)


where

    \hat{\bar{\delta}} = \frac{\hat{A} c}{\hat{A}\hat{H} - \hat{F}\hat{B}},    (91)

    \hat{A} = \bar{A} \hat{\psi} e^{\hat{\theta}}, \quad
    \hat{B} = \bar{B} e^{\hat{\theta}}, \quad
    \hat{F} = \bar{F} e^{\hat{\theta}}, \quad
    \hat{H} = \bar{H} e^{\hat{\theta}},    (92)

    \hat{v} = \hat{q} - g/V_m,

and where $\hat{q}$, $\hat{\theta}$, and $\hat{\psi}$ are the
conditional mean estimates for q, $\theta$, and $\psi$ generated by the
extended Kalman filter for the more accurate equations of motion
(Equations 74-77). L in Equation 88 is likewise the conditional
variance of $\theta$ from this more accurate filter. The measurement
for this more accurate filter is constructed as

    z = \frac{g - \bar{H}\delta}{\bar{F}}    (93)

so that, with the same measurement noise added, it follows from
Equation 58 and the definition of $\theta$ that

    z = e^{\theta} \alpha + \frac{\bar{H}}{\bar{F}} (e^{\theta} - 1)\, \delta + n.    (94)

Also, its (approximate) conditional mean is therefore

    \hat{z} = e^{\hat{\theta}} \hat{\alpha} + \frac{\bar{H}}{\bar{F}} (e^{\hat{\theta}} - 1)\, \delta.    (95)

In summary, the adaptive control law derived in this way is
that specified by Equations 88, 90-93 and 95 with $\hat{v}$,
$\hat{\theta}$, $\hat{\psi}$, and L (which is var($\theta$)) as
generated from the measurements of Equation 95 by the extended Kalman
filter for Equations 74-77 and 94.
Figure 3 shows the performance of this control law, both with and
without the dither term containing $\eta$, in a realistic missile and
aerodynamic simulation. The missile's fin deflection $\delta$ was
limited to ±25 degrees, so the control laws actually used saturated at
those values. Three flight conditions were simulated:

1.  3,000-ft.  altitude  at  Mach  3 

2.  20,000-ft.  altitude  at  Mach  2  (nominal) 

3.  60,000-ft.  altitude  at  Mach  1.5. 

The commanded acceleration c in each case was the step function
indicated in Figure 3. These flight conditions covered a dynamic
pressure variation over a factor of 65 as well as a factor of 2
variation in Mach number. For comparison, Figure 2 shows the
corresponding performance of the nonadaptive version of this
control law and also that of the open-loop control law for the
nominal flight condition only. The nonadaptive autopilot was clearly
a failure at flight conditions 1 and 3 (note the scale changes in
Figures 2b and 2d), although it and the adaptive autopilot
performed almost identically at the nominal condition 2. It was
always helpful to use the dither control component in the adaptive
autopilot, but its effect was barely noticeable except at the
high-altitude flight condition 3, where the time to adapt to the
nonnominal flight parameters was reduced by one-half. The
dynamic pressure was so low at flight condition 3 that the simulated
missile needed a 10-degree angle of attack to achieve even the
1-gravity limits of the commanded normal acceleration.

The  operation  of  the  adaptive  autopilot  is  displayed 
schematically  in  Figure  4,  where  the  definitions 


    P = C_{n\delta}/C_{n\alpha}

and

    r = C_{m\delta}/C_{m\alpha}

(both at the nominal flight condition)


are  adopted  for  convenience. 
FIGURE 4. Adaptive Missile Autopilot Operation.